Skip to content

SNOW-2333472: More hybrid benchmark fixes#3771

Merged
sfc-gh-joshi merged 1 commit into
mainfrom
joshi-SNOW-2333472-more-bmark-fixes
Sep 15, 2025
Merged

SNOW-2333472: More hybrid benchmark fixes#3771
sfc-gh-joshi merged 1 commit into
mainfrom
joshi-SNOW-2333472-more-bmark-fixes

Conversation

@sfc-gh-joshi

Copy link
Copy Markdown
Contributor
  1. Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes SNOW-2333472

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
      • If this test skips Local Testing mode, I'm requesting review from @snowflakedb/local-testing
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am adding new credentials
    • I am adding a new dependency
    • If this is a new feature/behavior, I'm adding the Local Testing parity changes.
    • I acknowledge that I have ensured my changes to be thread-safe. Follow the link for more information: Thread-safe Developer Guidelines
    • If adding any arguments to public Snowpark APIs or creating new public Snowpark APIs, I acknowledge that I have ensured my changes include AST support. Follow the link for more information: AST Support Guidelines
  3. Please describe how your code solves the related issue.

This PR addresses failures in rename and setting index (items 1 and 2 in the ticket). Items 3 and 4 are test setup issues, addressed in the benchmarking repo here: https://github.com/snowflakedb/snowpark-pandas-performance/pull/105.

I reran the hybrid CSV generation as well, but the size of the csv file increased, possibly due to other pandas tests being added before this PR was made. I'll omit updates to it for now, but we may want to do a more comprehensive audit to find internal points where we should use new_snow_series/new_snow_df (which disable auto-switching) instead of pd.Series/pd.DataFrame, since glancing at the error output seemed to indicate a large body of those errors.

We cannot do a bulk find-and-replace for these constructors, since in some cases wrapping a result in a Series/DF is the last operation before returning from a frontend method, and auto-switching may or may not be desirable. However, this is arguably a bug since explicit switching should only be decided by the QC caster; if so, we should consider updating modin to always disable casting before calling a frontend method.

@sfc-gh-joshi sfc-gh-joshi requested a review from a team as a code owner September 11, 2025 22:46
@sfc-gh-joshi sfc-gh-joshi added NO-CHANGELOG-UPDATES This pull request does not need to update CHANGELOG.md NO-PANDAS-CHANGEDOC-UPDATES This PR does not update Snowpark pandas docs labels Sep 11, 2025
@sfc-gh-joshi sfc-gh-joshi merged commit 828f8eb into main Sep 15, 2025
38 of 40 checks passed
@sfc-gh-joshi sfc-gh-joshi deleted the joshi-SNOW-2333472-more-bmark-fixes branch September 15, 2025 22:51
@github-actions github-actions Bot locked and limited conversation to collaborators Sep 15, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

NO-CHANGELOG-UPDATES This pull request does not need to update CHANGELOG.md NO-PANDAS-CHANGEDOC-UPDATES This PR does not update Snowpark pandas docs snowpark-pandas

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants